[Misc] Standardize RoPE handling for Qwen2-VL #9250
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
force-pushed from 33a037e to 992d4cd
```python
if scaling_type not in {"su", "longrope"}:
    scaling_factor = rope_scaling.get("factor", 1.0)
```
It is not clear which RoPE implementations need this, so I've moved this code into the individual if blocks.
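For illustration only, a minimal sketch of that refactor, with hypothetical surrounding names (not vLLM's actual code), reads `factor` inside each branch that needs it rather than once up front:

```python
# Hypothetical sketch: the generic "factor" lookup moves into the
# branches that use it, so implementations like su/longrope are not
# forced through it.
def resolve_scaling(rope_scaling: dict):
    scaling_type = rope_scaling["rope_type"]
    if scaling_type == "linear":
        # Only this branch needs a generic scaling factor.
        scaling_factor = rope_scaling.get("factor", 1.0)
        return ("linear", scaling_factor)
    if scaling_type in {"su", "longrope"}:
        # LongRoPE-style scaling carries its own per-dim factors,
        # so no generic "factor" lookup happens here.
        return ("longrope", None)
    return (scaling_type, None)
```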
force-pushed from 5c91da3 to 1e9605b
@ywang96 PTAL when you have time.
Can mrope be implemented with specialized CUDA operators? According to the profiling results, this part is quite time-consuming.
Can you open a new issue for this? Thanks
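For context on the cost being discussed: M-RoPE splits the rotary dimensions into temporal/height/width sections, each indexed by its own position stream. A naive PyTorch sketch (hypothetical names and shapes, not vLLM's actual implementation) shows the split/gather/concat pattern that a fused CUDA kernel would avoid:

```python
import torch


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    """Rotate the two halves of the last dim: (x1, x2) -> (-x2, x1)."""
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def naive_mrope(q: torch.Tensor,
                cos: torch.Tensor,
                sin: torch.Tensor,
                mrope_section: list[int]) -> torch.Tensor:
    """Hypothetical M-RoPE sketch.

    q:        (seq_len, num_heads, head_dim) query tensor.
    cos/sin:  (3, seq_len, head_dim) caches, one row per
              temporal/height/width position stream.
    mrope_section: per-section rotary-dim split, summing to head_dim // 2.
    """
    # Each rotary section reads from a different position stream; the
    # per-layer split + index + concat below is the overhead a fused
    # kernel could eliminate.
    sections = mrope_section * 2  # applied to both rotary halves
    cos = torch.cat(
        [chunk[i % 3] for i, chunk in enumerate(cos.split(sections, dim=-1))],
        dim=-1).unsqueeze(1)
    sin = torch.cat(
        [chunk[i % 3] for i, chunk in enumerate(sin.split(sections, dim=-1))],
        dim=-1).unsqueeze(1)
    return q * cos + rotate_half(q) * sin
```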
Left two comments - PTAL!
After testing on Qwen2-VL and Phi-3-medium, I'm fine with these changes since `su` and `mrope` are both handled correctly with patching.
huggingface/transformers#33401 has been fixed in Transformers v4.45.2, but the devs have clarified that M-RoPE is intended to be configured as `rope_type="default"`. To avoid future compatibility issues, this PR updates the RoPE code in vLLM to follow their specification.
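As a hypothetical illustration of what that spec implies for model configs: an M-RoPE model is signaled by the presence of a section split, while `rope_type` stays `"default"` (field values illustrative):

```python
# Hypothetical rope_scaling dict following the clarified spec:
# "rope_type" stays "default"; M-RoPE is indicated by the extra
# "mrope_section" field (per-section rotary-dim split, illustrative).
rope_scaling = {
    "rope_type": "default",
    "mrope_section": [16, 24, 24],  # temporal / height / width
}
```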